On Computing Average Common Substring Over Run Length Encoded Sequences
Authors
Abstract
Similar resources
Shortest Unique Substring Queries on Run-Length Encoded Strings
We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i′ ≤ s ≤ t ≤ j′ with j − i > j′ − i′, S[i′..j′] occurs at least twice in S. Given a run-length encoding of size m of a string of len...
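The definition above can be checked by brute force on small inputs. The following is a minimal sketch, not the algorithm of the cited paper: it first expands the run-length encoding into a plain string and then tries every candidate substring in order of increasing length. The names `rle_decode`, `count_occurrences`, and `shortest_unique_substrings` are hypothetical helpers introduced for illustration, and positions are 0-based and inclusive.

```python
def rle_decode(runs):
    """Expand a run-length encoding [(char, count), ...] into a plain string."""
    return "".join(ch * count for ch, count in runs)


def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    return sum(
        1
        for k in range(len(text) - len(pattern) + 1)
        if text.startswith(pattern, k)
    )


def shortest_unique_substrings(text, s, t):
    """Return all (i, j) with i <= s <= t <= j such that text[i..j] occurs
    exactly once in text and no shorter such substring covers [s, t].

    Assumes 0 <= s <= t < len(text). Brute force, for illustration only.
    """
    n = len(text)
    for length in range(t - s + 1, n + 1):
        found = []
        # i must satisfy i <= s, i + length - 1 >= t, and i + length <= n.
        for i in range(max(0, t - length + 1), min(s, n - length) + 1):
            j = i + length - 1
            if count_occurrences(text, text[i:j + 1]) == 1:
                found.append((i, j))
        if found:
            return found
    return []


if __name__ == "__main__":
    text = rle_decode([("a", 1), ("b", 1), ("a", 1), ("b", 1)])  # "abab"
    print(shortest_unique_substrings(text, 1, 1))
```

Because the whole string always occurs exactly once in itself, the search returns at least one interval for any valid query.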
A fast and simple algorithm for computing the longest common subsequence of run-length encoded strings
Let X and Y be two strings of lengths n and m, respectively, and k and l, respectively, be the numbers of runs in their corresponding run-length encoded forms. We propose a simple algorithm for computing the longest common subsequence of two given strings X and Y in O(kl + min{p1, p2}) time, where p1 and p2 denote the numbers of elements in the botto...
Matching for Run-Length Encoded Strings
Measuring the similarity between two strings, through such standard measures as Hamming distance, edit distance, and longest common subsequence, is one of the fundamental problems in pattern matching. We consider the problem of finding the longest common subsequence of two strings. A well-known dynamic programming algorithm computes the longest common subsequence of strings X and Y i...
Edit distance of run-length encoded strings
Let X and Y be two run-length encoded strings, of encoded lengths k and l, respectively. We present a simple O(|X|l + |Y|k) time algorithm that computes their edit distance. © 2002 Elsevier Science B.V. All rights reserved.
The Average Common Substring Approach to Phylogenomic Reconstruction
We describe a novel method for efficient reconstruction of phylogenetic trees, based on sequences of whole genomes or proteomes, whose lengths may greatly vary. The core of our method is a new measure of pairwise distances between sequences. This measure is based on computing the average lengths of maximum common substrings, which is intrinsically related to information theoretic tools (Kullbac...
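As a rough illustration of the "average lengths of maximum common substrings" mentioned above, the sketch below assumes one common reading of that quantity: for each position i of a sequence A, take the length of the longest substring starting at i that also occurs somewhere in a sequence B, then average over all positions of A. This reading, and the names `longest_match_length` and `average_common_substring`, are assumptions made for illustration; the snippet is a naive quadratic-or-worse computation and does not attempt the distance normalization or the efficient algorithms of the cited work.

```python
def longest_match_length(a, i, b):
    """Length of the longest prefix of a[i:] that occurs somewhere in b."""
    length = 0
    while i + length < len(a) and a[i:i + length + 1] in b:
        length += 1
    return length


def average_common_substring(a, b):
    """Average, over all start positions i of a, of the longest match of
    a[i:] inside b. Assumed reading of the measure; not symmetric in a, b."""
    if not a:
        raise ValueError("first sequence must be non-empty")
    total = sum(longest_match_length(a, i, b) for i in range(len(a)))
    return total / len(a)


if __name__ == "__main__":
    print(average_common_substring("abc", "bcd"))
    print(average_common_substring("bcd", "abc"))
```

Under this reading the value depends on the order of the arguments, so a pairwise distance built from it would need a separate symmetrization step.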
Journal
Journal title: Fundamenta Informaticae
Year: 2018
ISSN: 0169-2968, 1875-8681
DOI: 10.3233/fi-2018-1743